Styles of plots

Elizabeth King
Kevin Middleton

General caveat to keep in mind

Many ways to accomplish the same goal in R / ggplot.

  • Some might be faster than others
    • Unless you are working at large scale, it might not matter
    • Plotting 10^5 or 10^6 points? It might.
  • Some might be more or less convoluted
    • Pre-compute quantities or use a within-plot function?
  • Some might be more or less error-prone

Our way might not be the “best” way (for some definition of “best”).
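
As one illustration of the "pre-compute or use a within-plot function" choice, the sketch below draws species mean body mass both ways. It assumes the dplyr, ggplot2, and palmerpenguins packages are installed. The object names (species_means, p_precomputed, p_within) are made up for this example.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)

# Approach 1: pre-compute the species means, then plot the summary table
species_means <- penguins |>
  group_by(species) |>
  summarise(mean_mass = mean(body_mass_g, na.rm = TRUE))

p_precomputed <- ggplot(species_means, aes(x = species, y = mean_mass)) +
  geom_point(size = 3)

# Approach 2: pass the raw data and let ggplot2 compute the mean in the plot
p_within <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  stat_summary(fun = mean, geom = "point", size = 3, na.rm = TRUE)
```

Both plots should show the same three points. The first keeps the summary table around for inspection, while the second is shorter but hides the computation inside the plot.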

Plots convey information

First think about what information you want to convey to yourself or the reader.

Kinds of data

  • Numerical / Quantitative
    • Continuous, discrete
  • Categorical / Qualitative
    • Unordered (nominal), ordered (ordinal)

The type of data gives clues about what plots will show, how you set them up, and what elements they include.
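
A minimal base R sketch of how these kinds of data are commonly stored. All values and variable names here are hypothetical and chosen only for illustration.

```r
# Hypothetical example values, not taken from any dataset
mass_g   <- c(3750.5, 3800.0, 3250.2)              # continuous (double)
n_eggs   <- c(2L, 1L, 2L)                          # discrete (integer)
island   <- factor(c("Dream", "Biscoe", "Dream"))  # unordered categorical
size_cls <- factor(c("small", "large", "medium"),
                   levels = c("small", "medium", "large"),
                   ordered = TRUE)                 # ordered categorical

is.double(mass_g)
is.integer(n_eggs)
is.ordered(island)
is.ordered(size_cls)

# Comparisons such as "<" are only meaningful for ordered factors
size_cls < "large"
```

In ggplot2, numeric variables typically get continuous scales and factors get discrete scales, which is one reason the storage type matters before plotting.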

palmerpenguins

https://allisonhorst.github.io/palmerpenguins/

install.packages("palmerpenguins")

Look at penguins

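library(palmerpenguins)
library(dplyr)  # provides mutate() and if_else()

# Capitalize the sex labels; rows with missing sex stay NA
penguins <- penguins |> 
  mutate(sex = if_else(sex == "female", "Female", "Male"))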
names(penguins)
[1] "species"           "island"            "bill_length_mm"   
[4] "bill_depth_mm"     "flipper_length_mm" "body_mass_g"      
[7] "sex"               "year"             

Mix of categorical (species, island, sex) and continuous (bill..., flipper_length_mm, body_mass_g) variables

  • Integer: year
  • Note: we don’t recommend including units in variable names
# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_…¹ body_…² sex    year
   <fct>   <fct>              <dbl>         <dbl>      <int>   <int> <chr> <int>
 1 Adelie  Torgersen           39.1          18.7        181    3750 Male   2007
 2 Adelie  Torgersen           39.5          17.4        186    3800 Fema…  2007
 3 Adelie  Torgersen           40.3          18          195    3250 Fema…  2007
 4 Adelie  Torgersen           NA            NA           NA      NA <NA>   2007
 5 Adelie  Torgersen           36.7          19.3        193    3450 Fema…  2007
 6 Adelie  Torgersen           39.3          20.6        190    3650 Male   2007
 7 Adelie  Torgersen           38.9          17.8        181    3625 Fema…  2007
 8 Adelie  Torgersen           39.2          19.6        195    4675 Male   2007
 9 Adelie  Torgersen           34.1          18.1        193    3475 <NA>   2007
10 Adelie  Torgersen           42            20.2        190    4250 <NA>   2007
# … with 334 more rows, and abbreviated variable names ¹​flipper_length_mm,
#   ²​body_mass_g

Explore:

penguins |> 
  group_by(species, island, sex) |> 
  count()
# A tibble: 13 × 4
# Groups:   species, island, sex [13]
   species   island    sex        n
   <fct>     <fct>     <chr>  <int>
 1 Adelie    Biscoe    Female    22
 2 Adelie    Biscoe    Male      22
 3 Adelie    Dream     Female    27
 4 Adelie    Dream     Male      28
 5 Adelie    Dream     <NA>       1
 6 Adelie    Torgersen Female    24
 7 Adelie    Torgersen Male      23
 8 Adelie    Torgersen <NA>       5
 9 Chinstrap Dream     Female    34
10 Chinstrap Dream     Male      34
11 Gentoo    Biscoe    Female    58
12 Gentoo    Biscoe    Male      61
13 Gentoo    Biscoe    <NA>       5

A (not exhaustive) menu of plot elements (geoms)

  • points
  • lines (straight, curved)
  • bars (histograms)
  • densities
  • boxes
  • annotations
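
The menu above maps roughly onto ggplot2 layers as sketched below. This assumes ggplot2 and palmerpenguins are installed. The choice of variables, the annotation position, and the label text are arbitrary placeholders.

```r
library(ggplot2)
library(palmerpenguins)

base <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g))

# points
p_points <- base + geom_point(na.rm = TRUE)

# lines: a straight fit (lm) and a curved fit (loess)
p_straight <- p_points +
  geom_smooth(method = "lm", formula = y ~ x, na.rm = TRUE)
p_curved <- p_points +
  geom_smooth(method = "loess", formula = y ~ x, na.rm = TRUE)

# bars (histogram) and densities of a single variable
p_hist <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(bins = 30, na.rm = TRUE)
p_density <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_density(na.rm = TRUE)

# boxes
p_box <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot(na.rm = TRUE)

# annotations: placeholder text at an arbitrary position
p_annot <- p_points +
  annotate("text", x = 180, y = 6000, label = "example annotation")
```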

What information do I want to convey?

  • Associations between measurements
  • Differences in measurements between species
  • Differences in measurements between sexes (within species?)
  • Changes in measurements across time

Associations between measurements
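
One possible way to show an association between two measurements is a scatterplot. The sketch below is not necessarily the plot from the original slide. The variable pairing, the color mapping, and the axis labels are assumptions made for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# Scatterplot of two continuous measurements, colored by species
p_assoc <- ggplot(penguins,
                  aes(x = bill_length_mm, y = bill_depth_mm,
                      color = species)) +
  geom_point(na.rm = TRUE) +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)", color = "Species")

p_assoc
```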

Differences in measurements between species
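
A hedged sketch of one way to compare a measurement between species: a boxplot with the raw points jittered on top. Body mass is an arbitrary choice here, and the jitter width, transparency, and seed are illustrative settings rather than recommendations.

```r
library(ggplot2)
library(palmerpenguins)

p_species <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  # hide boxplot outliers because every point is drawn by the next layer
  geom_boxplot(outlier.shape = NA, na.rm = TRUE) +
  # jitter horizontally only, so the plotted masses are not altered
  geom_point(position = position_jitter(width = 0.15, height = 0, seed = 1),
             alpha = 0.4, na.rm = TRUE) +
  labs(x = "Species", y = "Body mass (g)")

p_species
```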

Differences in measurements between sexes within species
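
To compare sexes within species, one option is to facet by species and put sex on the x axis, as in the sketch below. It assumes dplyr, ggplot2, and palmerpenguins are installed. Dropping the rows with unknown sex is a choice made for this example, and penguins_sexed is a made-up name.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)

# Keep only penguins with a recorded sex (an assumption for this example)
penguins_sexed <- penguins |>
  filter(!is.na(sex))

p_sex <- ggplot(penguins_sexed, aes(x = sex, y = body_mass_g)) +
  geom_boxplot(na.rm = TRUE) +
  facet_wrap(vars(species)) +   # one panel per species
  labs(x = "Sex", y = "Body mass (g)")

p_sex
```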

Changes in measurements across time
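
A minimal sketch of one way to look at change across time: the mean body mass per species in each sampling year, connected by lines. Summarizing by the mean and using body mass are assumptions for illustration, not necessarily what the original slide showed.

```r
library(ggplot2)
library(palmerpenguins)

p_time <- ggplot(penguins,
                 aes(x = year, y = body_mass_g, color = species)) +
  # mean per species and year, computed inside the plot
  stat_summary(fun = mean, geom = "line", na.rm = TRUE) +
  stat_summary(fun = mean, geom = "point", na.rm = TRUE) +
  # label only the years that actually occur in the data
  scale_x_continuous(breaks = sort(unique(penguins$year))) +
  labs(x = "Year", y = "Mean body mass (g)", color = "Species")

p_time
```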

Plots gone wrong